Project description

You work at a startup that sells food products. You need to investigate user behavior for the company's app.

First study the sales funnel. Find out how users reach the purchase stage. How many users actually make it to this stage? How many get stuck at previous stages? Which stages in particular?

Then look at the results of an A/A/B test. (Read on for more information about A/A/B testing.) The designers would like to change the fonts for the entire app, but the managers are afraid the users might find the new design intimidating. They decide to make a decision based on the results of an A/A/B test. The users are split into three groups: two control groups get the old fonts and one test group gets the new ones. Find out which set of fonts produces better results.

Creating two A groups has certain advantages. We can make it a principle that we will only be confident in the accuracy of our testing when the two control groups are similar. If there are significant differences between the A groups, this can help us uncover factors that may be distorting the results. Comparing control groups also tells us how much time and data we'll need when running further tests.

You'll be using the same dataset for general analytics and for A/A/B analysis. In real projects, experiments are constantly being conducted. Analysts study the quality of an app using general data, without paying attention to whether users are participating in experiments.

Prepare the data for analysis

I loaded the logs, changed the column names to lowercase, found that 0.17% of the data consisted of duplicates, and dropped them. I also created two additional columns: datetime and date. Finally, I checked whether any users appeared in more than one experimental group; none did.
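The preparation steps can be sketched roughly as follows. This is a minimal example on a made-up three-row log; the raw column names (`EventName`, `DeviceIDHash`, `EventTimestamp`, `ExpId`) are assumptions based on a typical log export and may differ from the actual file.

```python
import pandas as pd

# Hypothetical mini-log; real column names and values may differ.
logs = pd.DataFrame({
    'EventName': ['MainScreenAppear', 'MainScreenAppear', 'CartScreenAppear'],
    'DeviceIDHash': [1, 1, 2],
    'EventTimestamp': [1564618800, 1564618800, 1564705200],
    'ExpId': [246, 246, 248],
})

# Lowercase the column names
logs.columns = logs.columns.str.lower()

# Measure the duplicate share, then drop full-row duplicates
dup_share = logs.duplicated().mean()
logs = logs.drop_duplicates().reset_index(drop=True)

# Derive datetime and date columns from the Unix timestamp
logs['datetime'] = pd.to_datetime(logs['eventtimestamp'], unit='s')
logs['date'] = logs['datetime'].dt.date

# Check that no user appears in more than one experimental group
groups_per_user = logs.groupby('deviceidhash')['expid'].nunique()
overlapping_users = groups_per_user[groups_per_user > 1]
print(f'{dup_share:.2%} duplicates, {len(overlapping_users)} overlapping users')
```

On the real dataset the duplicate share would come out to the 0.17% reported above; the toy data here is only meant to show the mechanics.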

Study and check the data

How many events are in the logs?

How many users are in the logs?

What's the average number of events per user?

What period of time does the data cover?

We can see from the histogram above that the data became complete starting August 1st: from that date on there is a cyclical pattern where the data is complete, while before August 1st the event count is much lower than the minimum count after that date. So I am removing the data before August 1st, leaving exactly one week of data: from August 1st to August 7th.
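A minimal sketch of the date filter and the loss calculation, on a hypothetical four-event slice (the cutoff date 2019-08-01 matches the week described above; the sample values are made up):

```python
import pandas as pd

# Hypothetical log slice with one event before the complete period.
logs = pd.DataFrame({
    'deviceidhash': [1, 2, 3, 3],
    'datetime': pd.to_datetime(
        ['2019-07-30 10:00', '2019-08-01 09:00',
         '2019-08-03 12:00', '2019-08-07 18:00']),
})

events_before = len(logs)
users_before = logs['deviceidhash'].nunique()

# Keep only the complete week: August 1st onward
filtered = logs[logs['datetime'] >= '2019-08-01']

events_lost = events_before - len(filtered)
users_lost = users_before - filtered['deviceidhash'].nunique()
print(f'lost {events_lost} events ({events_lost / events_before:.2%}) '
      f'and {users_lost} users')
```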

Did you lose many events and users when excluding the older data?

We lost 2826 events (243713 - 240887 = 2826), which is about 1.16% of the original data.

We also lost 17 users, which is about 1% of the users in the data.

Make sure you have users from all three experimental groups.

Study the event funnel

See what events are in the logs and their frequency of occurrence. Sort them by frequency.

MainScreenAppear has the highest frequency, followed by OffersScreenAppear, CartScreenAppear, PaymentScreenSuccessful and lastly Tutorial.
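The frequency ranking can be reproduced with `value_counts`, which sorts in descending order by default. The tiny sample below is made up; the real ranking above comes from the full log.

```python
import pandas as pd

# Hypothetical events column
logs = pd.DataFrame({'eventname': [
    'MainScreenAppear', 'MainScreenAppear', 'MainScreenAppear',
    'OffersScreenAppear', 'OffersScreenAppear',
    'CartScreenAppear', 'PaymentScreenSuccessful', 'Tutorial']})

# Frequency of each event, sorted descending
event_freq = logs['eventname'].value_counts()
print(event_freq)
```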

Find the number of users who performed each of these actions. Sort the events by the number of users. Calculate the proportion of users who performed the action at least once.

Per the above, the MainScreenAppear event was performed at least once by 98.47% of all users, while Tutorial was performed at least once by only 11.15% of users.
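Counting unique users per event and their share of the audience might look like this (a hypothetical three-user log; the real shares above come from the full dataset):

```python
import pandas as pd

# Hypothetical log: three users, not all of whom reach every event.
logs = pd.DataFrame({
    'deviceidhash': [1, 1, 1, 2, 2, 3, 1],
    'eventname': ['MainScreenAppear', 'OffersScreenAppear',
                  'CartScreenAppear', 'MainScreenAppear',
                  'OffersScreenAppear', 'MainScreenAppear',
                  'MainScreenAppear'],
})

total_users = logs['deviceidhash'].nunique()

# Unique users per event, sorted, and their share of all users
users_per_event = (logs.groupby('eventname')['deviceidhash']
                       .nunique()
                       .sort_values(ascending=False))
user_share = users_per_event / total_users
print(user_share)
```

Note that `nunique` counts each user once per event, so a user who saw the main screen many times still contributes a single count, which is what "performed the action at least once" requires.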

In what order do you think the actions took place? Are all of them part of a single sequence? Events that are not part of the sequence don't need to be taken into account when calculating the funnel.

The way I see it, the sequence is: MainScreenAppear--->OffersScreenAppear--->CartScreenAppear--->PaymentScreenSuccessful. Tutorial is not part of the funnel, as too few users triggered it, probably because the tutorial is optional.

Use the event funnel to find the share of users that proceed from each stage to the next. (For instance, for the sequence of events A → B → C, calculate the ratio of users at stage B to the number of users at stage A and the ratio of users at stage C to the number at stage B.)

We can see that 62% of users who saw MainScreenAppear proceed to OffersScreenAppear. Of those, 81% proceed to CartScreenAppear, and finally 95% of those make a purchase.
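The stage-to-stage ratios can be computed directly from the per-stage unique-user counts. The counts below are made up so as to mirror the shares reported above (62%, 81%, 95%); the actual numbers come from the dataset.

```python
import pandas as pd

# Hypothetical counts of unique users at each funnel stage, in funnel order
funnel = pd.Series(
    {'MainScreenAppear': 1000, 'OffersScreenAppear': 620,
     'CartScreenAppear': 502, 'PaymentScreenSuccessful': 477})

# Share of users that proceed from each stage to the next:
# each stage's count divided by the previous stage's count
step_conversion = (funnel / funnel.shift(1)).dropna()

# Share completing the whole journey: last stage over first stage
overall = funnel.iloc[-1] / funnel.iloc[0]
print(step_conversion.round(2).to_dict(), round(overall, 3))
```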

At what stage do you lose the most users?

We lose the most users at the MainScreenAppear stage, as only 62% proceed from there to the next stage.

What share of users make the entire journey from their first event to payment?

Per the above, about 48% of users make the entire journey and reach the payment stage (0.62 × 0.81 × 0.95 ≈ 0.48).

Study the results of the experiment

How many users are there in each group?

We have two control groups in the A/A test, where we check our mechanisms and calculations. See if there is a statistically significant difference between samples 246 and 247.

H0: The proportions of the two groups are equal.

H1: The proportions of the two groups are different.

We can see that in the A/A test there was no statistically significant difference between the control groups. This is great: it suggests the group split and our testing mechanisms work correctly.
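A sketch of a two-proportion z-test of the kind used for the A/A comparison. The user counts below are made up; the actual group sizes and per-event success counts come from the dataset. It uses only the standard library's `statistics.NormalDist` for the normal CDF.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ztest(success1, n1, success2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = success1 / n1, success2 / n2
    p_pooled = (success1 + success2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical counts: users who triggered a given event in groups 246 and 247
p_value = proportion_ztest(1200, 2484, 1190, 2513)
alpha = 0.05
print('reject H0' if p_value < alpha else 'fail to reject H0')
```

The same function is then reused for every event and every pair of groups, including the combined 246 + 247 control against 248.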

Do the same thing for the group with altered fonts. Compare the results with those of each of the control groups for each event in isolation. Compare the results with the combined results for the control groups. What conclusions can you draw from the experiment?

There was no statistically significant difference in the A/B test between groups 246 and 248; the new fonts did not have a statistically significant effect.

There was no statistically significant difference in the A/B test between groups 247 and 248; the new fonts did not have a statistically significant effect.

There was no statistically significant difference between the combined control groups (246 + 247) and group 248 either. In conclusion, the new fonts did not have a statistically significant effect on user behavior.

What significance level have you set to test the statistical hypotheses mentioned above? Calculate how many statistical hypothesis tests you carried out. With a statistical significance level of 0.1, one in 10 results could be false. What should the significance level be? If you want to change it, run through the previous steps again and check your conclusions.

I initially used 0.05 as the significance level, but even after lowering it to 0.005 to account for the number of tests performed, we still fail to reject the null hypothesis in every comparison.
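One common way to pick a corrected per-test threshold is the Bonferroni adjustment. The test count below is an assumption (4 funnel events times 4 group comparisons: 246 vs 247, 246 vs 248, 247 vs 248, and 246+247 vs 248) and may differ from the exact number of tests run in the notebook:

```python
# Assumed test count: 4 funnel events x 4 group comparisons = 16 tests
n_tests = 4 * 4

# Bonferroni: divide the family-wise significance level by the test count
alpha_family = 0.05
alpha_per_test = alpha_family / n_tests
print(alpha_per_test)  # prints 0.003125
```

With this correction the chance of at least one false positive across all 16 tests stays at or below 0.05, at the cost of making each individual test more conservative.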

In conclusion, I loaded the logs and prepared the data by removing duplicates and checking for NaN values, as well as changing the dtypes and creating datetime and date columns.

I also noticed that the data is incomplete before August 1st, so I dropped the earlier records and was left with one week of data.

I found that 0.17% of the data consisted of duplicates and dropped them, and I checked whether any users appeared in more than one group; none did.

Also found that the correct sequence of events is MainScreenAppear--->OffersScreenAppear--->CartScreenAppear--->PaymentScreenSuccessful.

After analyzing the results of the A/A/B test, I found no statistically significant difference between the groups, which means the fonts did not make a difference to user engagement. Therefore it might be more cost-effective to keep the old fonts, or, if we really want to change them, at least we know there will be no negative effect on the users.